ABSTRACT

An algorithm is presented for matching sets of curves observed in an image
to curve sets on 3-D model objects. Both the observed curves and the model
curves are represented by sequences of points sampled along approximations
to the three-dimensional curves. The algorithm finds the best match between
curve sets by computing the translation and rotation parameters which, when
applied to the observed curves, minimize the L2 norm between the observed
curves and the model curves.


1     INTRODUCTION 

In three-dimensional model-based vision we are given an image of an
object observed from an arbitrary viewpoint, possibly partially occluded
by intervening objects. The goal is to identify the object as one of a
known set of objects, and to determine its position and orientation in
3-space. We describe a method which uses 3-D curves to characterize
objects in a database of models, and present an algorithm that uses such
curves as the basis for object recognition. Curves in 3-space carry a
large amount of information about the scene in which they appear, and can
characterize objects drawn from large sets of candidates. The curves can
be 'painted curves', i.e. curves that correspond to changes of
reflectivity only, without any rapid local depth changes; curves of
intersection between two surfaces (either convex or concave); curves of
occlusion (i.e. object boundaries); or curves of maximum curvature
('ridges'). Although the information contained in these curves is not
sufficient for object reconstruction, it should be sufficient to allow
accurate recognition of objects which are not relatively smooth and
featureless ('egg-shaped').

To recognize objects using 3-D curves it is necessary to begin with the
true 3-D coordinates of points lying on the curves. For this, range
information is needed. Range data can be gathered in many ways (see [BJ85]
for a survey of range gathering techniques), and we will not be concerned
here with the data acquisition process. Note, however, that exclusive use
of range data for recognition is as unnecessarily limited an approach as
exclusive use of intensity data, since each of the image classes contains
information not present in the other. Our method therefore makes combined
use of both range and intensity data ([DNB79],[BS86]), which we assume to
be accurately registered, so that for every pixel in the image we have both
the reflected intensity and the true x,y,z coordinates of the surface at
that point. In particular, the 3-D coordinates of the edges of a painted
figure on a non-planar object can only be detected using this combined
data: given both kinds of data, we first extract the edges of the figure
from the intensity data and then use the range information to determine
the coordinates of these edges. Figs 1, 2 and 3 illustrate this point by
showing an example of a painted flower pot, and some of the curves that
can be extracted from the image using both sorts of data. Combined use of
range and intensity images both improves the quality of the edge detection
and can be used to characterize edge types. For example, the occluding
edge of a coffee pot (marked by a jump in range value) is characterized by
the occurrence of discontinuities in the first derivatives of both the
intensity and range images, while a roof edge (at which two surfaces meet)
is characterized by discontinuities in the first and second derivatives of
the intensity and range images respectively.
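
The two-step extraction described above (edges from the intensity image,
coordinates from the registered range image) can be sketched as follows.
This is only an illustration under stated assumptions: the images are taken
to be nested lists on the same pixel grid, and the names `edge_points_3d`,
`edge_mask` and `xyz` are hypothetical, not part of the method of the paper.

```python
def edge_points_3d(edge_mask, xyz):
    """Collect the 3-D coordinates of every pixel flagged as an edge.

    edge_mask: 2-D list of booleans from an intensity edge detector
               (assumed to be computed elsewhere).
    xyz:       2-D list of (x, y, z) tuples from the range image,
               assumed registered pixel-for-pixel with edge_mask.
    """
    points = []
    for r, row in enumerate(edge_mask):
        for c, is_edge in enumerate(row):
            if is_edge:
                # The range image supplies the true coordinates of the edge.
                points.append(xyz[r][c])
    return points
```

The points returned are unordered; linking them into curves is a separate
step not shown here.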

Use of curves rather than surface data in model-based vision can increase
the speed of object recognition considerably. Moreover, if this approach
is used, the data required to characterize objects reduces to a group of
curves extracted from the object, an extremely compact form as compared to
the full 2-D surface representation which might otherwise be required to
represent real-world objects. For example, the curves extracted from the
painted flower pot of Fig 1 suffice to identify the pot and determine its
orientation.

In what follows we explain a technique for matching observed curves
against object models that has been used in our laboratory. Section 2
reviews the algorithm for observed-curve to model-curve matching developed
in [SS85]. Section 3 extends this algorithm to allow for simultaneous
matching of multiple observed curves from an image to multiple curves that
represent an object model.

Figure 1: (a) intensity image of a flower pot. (b) edges detected in the
intensity image, before cleaning.

Figure 2: (a) range image of a flower pot. (b) discontinuities in depth
detected in the range data.

Figure 3: the intensity edges after cleaning.


2      Fast  curve  matching 


This section reviews the technique for matching an observed curve C and a
model curve C' presented in [SS85],[BSSS86]. The algorithm calculates the
translation and rotation parameters that minimize the Euclidean distance
between the observed curve C and some portion of the model curve C'. After
normalization by the length of the curves, the L2 distance between the two
curves computed by this algorithm can serve as an indication of the
quality of a match. In the case where an observed object is matched
against a database of models, the model that best fits the observed curve
will be the one with the smallest least square difference value.

The algorithm assumes that the two curves C and C' to be matched are
parametrized by arc length, and so the algorithm is not scale invariant.
In two dimensions this limits the use of the algorithm to a restricted
scenario [KSSS85]. However, in three dimensions range data will provide
the true 3-D coordinates of a curve, and so size ambiguity is not a
concern. The parametrization by arc length is sensitive to noise, so
application of the algorithm must be preceded by application of a
smoothing technique (see [SS85]).
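
The evenly spaced representation assumed by the algorithm can be obtained
by resampling a measured polyline at equal steps of arc length. The
following is a minimal sketch, not the smoothing procedure of [SS85]; the
name `resample_by_arc_length` and the parameter `spacing` are hypothetical,
and the input is assumed to have been smoothed already.

```python
import math


def resample_by_arc_length(points, spacing):
    """Return points spaced `spacing` apart (in arc length) along a polyline.

    points:  sequence of (x, y, z) tuples describing the polyline.
    spacing: positive arc-length step between consecutive output points.
    """
    if len(points) < 2 or spacing <= 0:
        return [tuple(p) for p in points[:1]]
    out = [tuple(points[0])]
    dist_to_next = spacing  # arc length left before the next sample is due
    for p, q in zip(points, points[1:]):
        seg = math.dist(p, q)
        if seg == 0.0:
            continue  # skip repeated points
        pos = 0.0  # distance already travelled along this segment
        while seg - pos >= dist_to_next - 1e-12:
            pos += dist_to_next
            t = pos / seg
            # Linear interpolation between the segment end points.
            out.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
            dist_to_next = spacing
        dist_to_next -= seg - pos
    return out
```

Because the step is a fixed physical length, two curves resampled with the
same `spacing` can be compared point by point, which is what the matching
below relies on.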

More specifically, let the curves C and C' be represented by sequences of
evenly spaced points $(U_i)_{i=1}^{n}$ and $(V_i)_{i=1}^{m}$ respectively,
such that $n < m$ (the observed curve C is a subset of the model curve C').
Matching the two curves is equivalent to finding a Euclidean motion $E$ in
three-space that minimizes the L2 distance between $(EU_i)_{i=1}^{n}$ and
$(V_{s+i})_{i=1}^{n}$ for some starting point $s$ on C', i.e. computing

\[ \Delta_s = \min_{E} \sum_{i=1}^{n} | EU_i - V_{s+i} |^2 . \]


To simplify the required calculation, the observed curve C is translated
so that its center of mass lies at the origin, giving

\[ \sum_{i=1}^{n} U_i = 0 . \]

Next we write $E$ as $EU = RU + a$, where $R$ is a pure rotation and $a$ is
a translation. Then

\[ \Delta_s = \min_{R,a} \sum_{i=1}^{n} | RU_i + a - V_{s+i} |^2 \]

\[ \Delta_s = \min_{R,a} \sum_{i=1}^{n} \left( |a|^2 + |V_{s+i}|^2 - 2a \cdot V_{s+i} + |U_i|^2 + 2a \cdot RU_i - 2RU_i \cdot V_{s+i} \right) . \]

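In this last equation the term $2a \cdot RU_i$ sums to zero, because the
center of mass of the $U_i$ lies at the origin. The variables $a$ and $R$
therefore appear separately, and we can minimize over them independently.
The minimizing value of $a$ is

\[ a = \frac{1}{n} \sum_{i=1}^{n} V_{s+i} . \]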

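To obtain $R$ we must compute

\[ \delta = \max_{R} \sum_{i=1}^{n} RU_i \cdot V_{s+i} , \]

whose maximum value is easily seen to be

\[ \delta = \operatorname{tr}\left( (A^{*}A)^{1/2} \right) , \]

where the matrix $A$ is given by

\[ A(l,m) = \sum_{i=1}^{n} U_i(l)\, V_{s+i}(m) \quad \text{for } l,m = 1,2,3, \]

and $U_i(l)$ is the $l$-th coordinate of $U_i$. Putting everything together
we have

\[ \Delta_s = \sum_{i=1}^{n} \left( |U_i|^2 + |V_{s+i}|^2 \right) - \frac{1}{n} \left| \sum_{i=1}^{n} V_{s+i} \right|^2 - 2\delta . \]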
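
The closed-form expression for the match value at a single starting point
can be evaluated directly, as in the sketch below. It is an illustration
under stated assumptions rather than the implementation of [SS85]: indices
are 0-based, the observed points `U` are assumed to be already centered at
the origin, and the names `match_score` and `symmetric_eigenvalues` are
hypothetical. The trace term is computed as the sum of the square roots of
the eigenvalues of the 3 by 3 matrix A*A, using the standard trigonometric
solution for a symmetric 3 by 3 matrix. The trace formula is used exactly
as stated in the text, with no separate treatment of the case in which the
optimum is a reflection rather than a rotation.

```python
import math


def symmetric_eigenvalues(M):
    """Eigenvalues of a symmetric 3x3 matrix (trigonometric method)."""
    a, b, c = M[0][0], M[1][1], M[2][2]
    d, e, f = M[0][1], M[0][2], M[1][2]
    off = d * d + e * e + f * f
    if off == 0.0:
        return [a, b, c]  # already diagonal
    q = (a + b + c) / 3.0
    p = math.sqrt(((a - q) ** 2 + (b - q) ** 2 + (c - q) ** 2 + 2.0 * off) / 6.0)
    # B = (M - q I) / p has eigenvalues 2 cos(phi + 2 pi k / 3).
    ba, bb, bc = (a - q) / p, (b - q) / p, (c - q) / p
    bd, be, bf = d / p, e / p, f / p
    det_b = (ba * (bb * bc - bf * bf)
             - bd * (bd * bc - bf * be)
             + be * (bd * bf - bb * be))
    r = max(-1.0, min(1.0, det_b / 2.0))
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [e1, 3.0 * q - e1 - e3, e3]


def match_score(U, V, s):
    """Return (Delta_s, a) for centered observed points U against V[s:s+n]."""
    n = len(U)
    W = V[s:s + n]
    squares = sum(x * x for p in U for x in p) + sum(x * x for p in W for x in p)
    sum_v = [sum(p[c] for p in W) for c in range(3)]
    # A(l, m) = sum_i U_i(l) * V_{s+i}(m)
    A = [[sum(u[l] * w[m] for u, w in zip(U, W)) for m in range(3)]
         for l in range(3)]
    AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
    delta = sum(math.sqrt(max(ev, 0.0)) for ev in symmetric_eigenvalues(AtA))
    a = [c / n for c in sum_v]  # minimizing translation
    score = squares - sum(c * c for c in sum_v) / n - 2.0 * delta
    return score, a
```

For an observed curve that is an exact rotated and translated copy of a
model portion, the score is zero up to rounding and `a` is the translation
that was applied.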

The time required to compute a single value of $\Delta_s$ is $O(n)$ (where
$n$ is the number of sampled points on the observed curve C). If the
observed curve C is partially occluded and the starting point $s$ on the
model is not known, then we have to compute $\Delta_s$ for $s$ from 1 to
$m$. The first two terms of $\Delta_s$ can be computed for every $s$ in
time $O(m)$ ($m$ is the number of points on the model). Computing the
matrices $A$ for every $s$ can be done in time $O(m \log m)$ using the fast
Fourier transform to compute the required convolutions, and hence the
total complexity of the matching operation is $O(m \log m)$. To illustrate
this method, Fig 4 shows an example of an observed curve in 3-D, and a
model curve. The best match found for the two curves corresponds to a
translation of (0.1, -2.5, 0.1) and the rotation matrix

\[ \begin{pmatrix} 0.3 & 0.8 & -0.4 \\ 0.1 & 0.4 & 0.9 \\ 0.9 & 0.4 & 0.1 \end{pmatrix} . \]
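
The two ingredients of the complexity argument above can be sketched as
follows: running sums give the window totals for every starting point in
linear time, and one FFT-based correlation gives one entry of the matrix A
for every starting point at once (nine such correlations, one per
coordinate pair, give the whole matrix). This is a hedged illustration
with 0-based indices; the names `window_sums`, `fft` and
`sliding_correlation` are hypothetical, and the small recursive radix-2
transform stands in for whatever FFT routine an implementation would use.

```python
import cmath


def window_sums(values, n):
    """Sum of every length-n window of `values`, via prefix sums, in O(m)."""
    prefix = [0.0]
    for x in values:
        prefix.append(prefix[-1] + x)
    return [prefix[s + n] - prefix[s] for s in range(len(values) - n + 1)]


def fft(x, invert=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (unnormalized)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], invert)
    odd = fft(x[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def sliding_correlation(u, v):
    """c[s] = sum_i u[i] * v[s + i] for every valid s, in O(m log m)."""
    n, m = len(u), len(v)
    size = 1
    while size < n + m:
        size *= 2
    # Correlation is convolution with the reversed observed sequence.
    fu = fft([complex(x) for x in reversed(u)] + [0j] * (size - n))
    fv = fft([complex(x) for x in v] + [0j] * (size - m))
    prod = fft([p * q for p, q in zip(fu, fv)], invert=True)
    # Index n - 1 + s of the convolution holds c[s]; divide by size to
    # normalize the inverse transform.
    return [prod[n - 1 + s].real / size for s in range(m - n + 1)]
```

Here `u` would be one coordinate of the observed points and `v` one
coordinate of the model points, while `window_sums` applied to the squared
norms of the model points yields the second term of the match value for
all starting points.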


3      Simultaneous  matching  of  several  curves 

The model of a three-dimensional object will usually contain several
curves, and so will the image of an observed object. To achieve object
recognition it is therefore appropriate to match several observed curves
to the several curves known to be present on a model. A naive way of
performing this task might be to match every single curve fragment from an
observed 3-D image against all the models in the database, combine all the
best results, and hope to find them all consistent, in the sense that one
model and orientation matches all the observed curves best. However, if
naively formulated this approach is not likely to be acceptable, since it
finds a local minimum match for each observed curve, in no way
guaranteeing that they assemble into a global least square difference
between all the observed curves and the model. In particular, this local
minimization approach ignores the fact that the group of curves must lie
on a rigid body in three-space, and so might compute completely different
translations and rotations for different curves. Fig 5 illustrates this
point by showing a match of two curves to a model, which results in the
two separate 'best' translations (-13.2, -18.4, -12.3) and
(-16.0, 2.7, 4.8), and two corresponding rotation matrices

\[ \begin{pmatrix} 0.3 & -0.8 & -0.4 \\ 0.4 & 0.5 & 0.7 \\ -0.8 & 0.7 & 0.5 \end{pmatrix} , \qquad \begin{pmatrix} -0.5 & 0.8 & -0.3 \\ 0.3 & 0.1 & 0.9 \\ 0.8 & 0.6 & 0.2 \end{pmatrix} . \]

Figure 4: (a) is an observed curve in three space, (b) is the model of the
object, (c) shows the translated and rotated observed curve overlaid on the
model.

To  correct  for  these  flaws,  the  following  is  required: 

(a) We must find a single translation and rotation giving a global least
square difference between observed and model curves. In effect, we want to
consider all the observed curves as one generalized curve which is matched
to the model in a unitary fashion.

(b) We must use the known rigidity of the object to be identified to
constrain the number of matches that must be tried. This is important
since otherwise the number of possible matches might grow exponentially
with the number of observed curve fragments to be matched.

Figure 5: (a) two observed curves in three space, (b) is the model of the
object, (c) shows the translated and rotated observed curves overlaid on
the model.

These two aims are addressed in what follows. Specifically, in Section
3.1 we show how to compute least square matches between several curves
and a model, and in Section 3.2 we show how to use the rigidity
constraints to limit the collection of matches that must be tried.


3.1      Formulation 

In a multi-curve matching problem, we are given $k$ observed curves
$U^{(j)}$ for $j = 1, \ldots, k$, each consisting of a sequence of points
$U_i^{(j)}$, $i = 1, \ldots, n_j$, and a model curve $(V_i)_{i=1}^{m}$ such
that $\sum_{j=1}^{k} n_j < m$ (the observed curves are assumed to be a
subset of the model curve). Assume for the moment that we are also given
$k$ points $(s_j)_{j=1}^{k}$ on the model curve $V$ as starting points of
the $k$ observed curves (we will show later how to find these points among
the $m^k$ possibilities that arise). Our aim is to find the Euclidean
transformation $E$ which minimizes the L2 distance between the observed
curves and the model curve, i.e. to compute

\[ \Delta = \min_{E} \sum_{j=1}^{k} \sum_{i=1}^{n_j} | EU_i^{(j)} - V_{s_j+i} |^2 . \]

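This can be accomplished by a straightforward generalization of the
considerations which apply in the case of single curve matching. To
simplify the calculation, the observed curves are translated so that their
common center of mass lies at the origin, i.e.

\[ \sum_{j=1}^{k} \sum_{i=1}^{n_j} U_i^{(j)} = 0 . \]

Next we write $E$ as $EU = RU + a$, where $R$ is a pure rotation and $a$ is
a translation. Then

\[ \Delta = \min_{R,a} \sum_{j=1}^{k} \sum_{i=1}^{n_j} | RU_i^{(j)} + a - V_{s_j+i} |^2 . \]

When the square is expanded as in Section 2, the cross term between $a$ and
$RU_i^{(j)}$ sums to zero. Since $a$ and $R$ then appear independently in
$\Delta$, they can be minimized separately. The minimum over $a$ occurs
when

\[ a = \frac{1}{\sum_{j=1}^{k} n_j} \sum_{j=1}^{k} \sum_{i=1}^{n_j} V_{s_j+i} . \]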

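This is equivalent to translating the common center of mass of the curve
fragments in the model that are being matched to the center of mass of the
collection of observed curves. To find $R$ we compute

\[ \delta = \max_{R} \sum_{j=1}^{k} \sum_{i=1}^{n_j} RU_i^{(j)} \cdot V_{s_j+i} , \]

or

\[ \delta = \operatorname{tr}\left( (A^{*}A)^{1/2} \right) , \qquad A = \sum_{j=1}^{k} A^{(j)} , \]

where the matrix $A^{(j)}$ is given by

\[ A^{(j)}(l,m) = \sum_{i=1}^{n_j} U_i^{(j)}(l)\, V_{s_j+i}(m) , \]

where $l,m$ denote the coordinates of the vectors (for $l,m = 1,2,3$).
Putting everything together we have

\[ \Delta = \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( |U_i^{(j)}|^2 + |V_{s_j+i}|^2 \right) - \frac{1}{\sum_{j=1}^{k} n_j} \left| \sum_{j=1}^{k} \sum_{i=1}^{n_j} V_{s_j+i} \right|^2 - 2\delta . \]

The time required to compute $\Delta$ given the $k$ starting points $s_j$
for $j = 1, \ldots, k$, and to find the translation and rotation
parameters, is $O(m)$, where $m$ is the number of sample points of the
model curve.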
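
For given starting points, the quantities that enter the multi-curve match
can be accumulated in a single pass, as in the sketch below. It assumes
0-based indices and curves stored as lists of (x, y, z) tuples; the names
`center_jointly` and `combined_terms` are hypothetical. The sketch shows
only the joint centering, the minimizing translation and the summed
matrix; the trace term would then be computed from that matrix exactly as
in the single-curve case.

```python
def center_jointly(curves):
    """Translate all curves so that their common center of mass is the origin."""
    total = sum(len(c) for c in curves)
    centroid = [sum(p[d] for c in curves for p in c) / total for d in range(3)]
    return [[tuple(p[d] - centroid[d] for d in range(3)) for p in c]
            for c in curves]


def combined_terms(curves, model, starts):
    """Return (a, A) for jointly centered curves placed at `starts` on `model`.

    a: mean of the matched model points (the minimizing translation).
    A: sum over all curves of sum_i U_i(l) * V_{s_j+i}(m).
    """
    total = sum(len(c) for c in curves)
    sum_v = [0.0, 0.0, 0.0]
    A = [[0.0] * 3 for _ in range(3)]
    for curve, s in zip(curves, starts):
        for i, u in enumerate(curve):
            v = model[s + i]
            for l in range(3):
                sum_v[l] += v[l]
                for m in range(3):
                    A[l][m] += u[l] * v[m]
    a = [c / total for c in sum_v]
    return a, A
```

Accumulating one matrix for all curves, rather than one rotation per
curve, is what enforces a single rigid motion for the whole group.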

If the observed curves are partially occluded and the starting points
$s_j$ on the model are not known, then we have to be able to compute the
minimum of $\Delta$ over all $m^k$ possible combinations of the starting
points. The first two terms of $\Delta$ can be computed in time $O(m)$ for
all possible $s_j$, and the sequence of matrices $A^{(j)}$ can be computed
for all possible starting points in time $O(m \log m)$ for each observed
curve by using the fast Fourier transform to compute the required sequence
of convolutions. Nevertheless, finding the minimizing $\Delta$ would
require $O(m^k)$ time if we search all the possibilities.


3.2      Use  of  rigidity  constraints. 

The least square distance function $\Delta$ has $k$ parameters $s_j$, each
of which can assume an integer value from 1 to $m$, so that minimizing
$\Delta$ would require time $m^k$ if the parameters were independent.
However, in the situation which interests us these parameters are
dependent, since they represent the positions of the observed curves on
the model curve. This suggests that we can begin by finding the rotation
and translation parameters for one curve only, and then use the other
observed curves to refine this initial estimate.

Accordingly we designate one observed curve from the group as a 'base'
curve, and relate the positions of all other curves to the resulting
'base' position. The 'base' curve is shifted along the model curve point
by point, and the orientation which gives the best match is computed for
every point. It is best to use the longest observed curve as the 'base'.

For every starting point of the 'base' curve we can compute the rotation
and translation parameters that correspond to the best local match between
the 'base' curve and the model, using the fast curve matching technique
presented in Section 2. Then we can apply the same transformation to all
other observed curves, and compute their relative positions on the model.

The starting-point coordinates thereby computed are only an approximation
to the true positions of these points in a global best match, and may not
lie on the model itself but at some distance from the model curve. As a
result we need to find the point on the model curve which is nearest to
the computed starting point position. If the distance between the computed
starting point and the point found on the model curve exceeds some
threshold value $T$, we reject it, and conclude that the 'base' curve
cannot be given its assumed initial position on the model. In this case,
we move to the next possible starting point on the model. For this, a way
of answering nearest-neighbor queries is needed.
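
The placement step just described can be sketched as follows. This is a
hedged illustration, not the procedure used in the laboratory system: the
rotation `R` (a 3 by 3 nested list) and translation `a` are assumed to
come from the 'base' curve match, the nearest model point is found here by
brute force for brevity, and the names `apply_motion` and
`place_other_curves` are hypothetical. The check that a placed curve does
not run past the end of the model is an added assumption.

```python
import math


def apply_motion(R, a, p):
    """Apply the Euclidean motion p -> R p + a to one point."""
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + a[r] for r in range(3))


def place_other_curves(R, a, other_curves, model, T):
    """Starting index on `model` for each curve, or None if any is rejected.

    Each curve's first point is moved by the 'base' transformation and
    snapped to the nearest model point; a distance above T rejects the
    assumed 'base' position altogether.
    """
    starts = []
    for curve in other_curves:
        q = apply_motion(R, a, curve[0])
        best = min(range(len(model)), key=lambda i: math.dist(q, model[i]))
        if math.dist(q, model[best]) > T:
            return None  # this 'base' position cannot be correct
        if best + len(curve) > len(model):
            return None  # assumption: the curve must fit on the model
        starts.append(best)
    return starts
```

A caller would loop over the possible 'base' positions, skip those for
which `None` is returned, and evaluate the global match value for the
rest.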

In two dimensions there exists a solution to the nearest neighbor problem
with $O(m \log m)$ preprocessing and $O(\log m)$ query time using Voronoi
diagrams [PS85], but in three dimensions this computation is more costly
(see [Cb85] for an $O(n^2)$ preprocessing and $O(\log^2 n)$ query time
algorithm, and [Ck85] for a probabilistic algorithm). However, in our
application we are interested in a neighbor only if the distance from the
query point to this neighbor is smaller than a threshold $T$. We can
therefore proceed as follows. First sort the model point coordinates
separately in each dimension (this is a preprocessing step). Then in each
dimension take all points whose distance from the corresponding query
point coordinate is $T$ or less, and keep the points selected in all three
dimensions. This will find every model point within distance $T$ of the
query point, and every point found lies within distance $\sqrt{3}\,T$ of
it. Although the worst-case complexity of this operation is $O(m)$, since
it is possible that all the points on the model curve are clustered
together in the neighborhood of the query point, in all cases likely to
arise the number of points on a curve found in a small region will be very
small, so that the more likely complexity of this operation is
$O(\log m)$ per query.

From  the  points  found  by  the  search  (if  any),  the  one  point  which  is 
closest  to  the  query  is  chosen  to  be  the  starting  point  on  the  model  for  the 
observed  curve. 
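
A minimal sketch of this thresholded search, using binary search on the
per-dimension sorted coordinates, is given below. The class name
`ThresholdNeighborIndex` and its methods are hypothetical; the final
filter keeps only candidates within distance T of the query, in line with
the rejection rule stated earlier.

```python
import bisect
import math


class ThresholdNeighborIndex:
    """Nearest model point within a threshold, via sorted coordinate lists."""

    def __init__(self, points):
        self.points = list(points)
        # Preprocessing: for each dimension, (coordinate, index) pairs sorted.
        self.sorted_axes = [
            sorted((p[d], i) for i, p in enumerate(self.points))
            for d in range(3)
        ]

    def _slab(self, d, center, T):
        """Indices of points whose d-th coordinate is within T of center."""
        axis = self.sorted_axes[d]
        lo = bisect.bisect_left(axis, (center - T, -1))
        hi = bisect.bisect_right(axis, (center + T, len(self.points)))
        return {i for _, i in axis[lo:hi]}

    def nearest_within(self, q, T):
        """Index of the model point closest to q, or None if none is within T."""
        candidates = self._slab(0, q[0], T)
        for d in (1, 2):
            if not candidates:
                return None
            candidates &= self._slab(d, q[d], T)
        best, best_dist = None, T
        for i in candidates:
            dist = math.dist(q, self.points[i])
            if dist <= best_dist:
                best, best_dist = i, dist
        return best
```

The two binary searches per dimension account for the logarithmic cost
per query; the remaining work is proportional to the number of model
points that fall inside the slabs.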

Applying this procedure to all observed curves (other than the 'base'
curve) we find all $k$ starting points on the model $s_1, \ldots, s_k$, and
compute an instance of $\Delta$. (As noted previously, if the algorithm
fails to position any of the observed curves on the model, the position of
the 'base' curve is determined to be wrong.) Overall there will be $m$
possible positions for the 'base' curve, and for each one of them the
algorithm performs $k$ binary searches on the sorted model (one for each
observed curve), leading to an expected complexity of $O(km \log m)$ for
the whole procedure.

Fig  6  shows  the  results  of  applying  this  procedure  to  simultaneous 
matching  of  three  observed  curves  to  a  model  curve. 

Figure 6: (a) observed curves in three space, (b) is the model of the
object, (c) shows the translated and rotated observed curves overlaid on
the model.